(54) Title: COMMUNICATIONS NETWORK MANAGEMENT



(57) Abstract 



A communications network (1) such as 
a Global Multi Service Network is provided 
with a management system which comprises a 
distributed control system (4). The distributed 
control system (4) is an open community of 
co-operating intelligent software agents (5, 6)
which individually have control of, or responsibility
for managing, one or more nodes (3) of the
communications network (1). There are software
agents of more than one type and the service 
management agents (5) which have control over 
nodes (3) of the network (1) enter a negotiation 
process with customer agents (6) in the provision 
of new services, so as to meet the constraints of 
both customer requirements and the interest of the 
relevant service provider. In the event of agent 
failure, the service management agents (5) initiate 
a bidding process to reallocate the responsibilities 
of a failed agent.
COMMUNICATIONS NETWORK MANAGEMENT



The present invention relates to communications
networks and particularly to the management thereof.

Competitive advantage can be gained by communications
network operators through the services that they offer and
the efficiency with which they manage those services.
Targets that a network operator might aim for include reduced
charges, improved quality and increased customer control of
services. Part of the networking infrastructure that might
facilitate these customer offerings may well be the Global
Multi-Service Networks (GMSNs) which enable network operators
to offer their customers:

- Rapid service provisioning
- Controlled quality of service
- Integrated services
- Regulated control of network services

Ideally, these facilities will be offered with the
same availability as voice connectivity is today but
providing many new features together with mobility and
movability of customers.

To enable network operators to offer their customers
the extensive flexibility, quality and control the above
demands, GMSNs will need to support:

- Multi service provision
- Multiple vendors
- Multiple administrators
- Flexible service management

The complexity and operational characteristics of
GMSNs are expected to impose requirements beyond the
capabilities of current network management approaches. Not
only will the GMSNs have to provide services to the customer
according to contract, but price and performance will have to
be optimised at the same time for the network operator.

A Multi-Service Network (MSN) is any network that is
capable of supporting a range of services. The Pan-European
Integrated Broadband Network investigated in a European RACE
initiative, and referred to in the paper "Broadband
Communication Management - The RACE TMN Approach" presented
by R Smith at the IEE Broadband Conference in London in 1990,
is an example of a MSN based on the Asynchronous Transfer
Mode. There are networks currently available in the USA
which are examples of MSNs that use more conventional
switches (e.g. DMS 250 from Northern Telecom). Such networks
can be used to transmit voice as well as data. The data can
be split into various transmission rates, for instance from
19 kbits/sec up to 40 Mbits/sec, so that a range of services
from file transfer to real-time video can be supported.
Furthermore, the trend in such networks is towards global
networks where the MSN can span many countries, hence the
emergence of GMSNs.

Initially at least, the intended customers for MSNs
are expected to be large corporate users, perhaps with many
sites situated world-wide. Such a customer will require a
network which appears to be a private switched network,
providing at least the functionality that they enjoy from the
international private leased circuits. In fact the service
can be supported by a number of underlying networks, possibly
from many different network operators. This arrangement is
known as a virtual network.

Service Level Agreements (SLAs)

These companies often entrust a large proportion of
their world telecommunications requirements to one service
provider by contract. It is extremely important that they
are provided with the level of service specified in their
contract. The exact definition of the service is specified
in a Service Level Agreement (SLA). The range of services
available is potentially extremely large, and each service
can be further customised since each service has a range of
options. Example services include:

- Dedicated international private leased circuits
- Routing controlled by
      time of day
      calling identity
      originated location
- Customer controlled dialling plans



An example of the latter is where a user needs only to 
dial 111 to get through to the relevant sales department, 
regardless of where the call is originated geographically in 
relation to the sales department. 

A SLA can be expected in general to include: 



- Grade of Service (blocking probabilities, bit error
  rate, error free seconds etc.)
- Target and guaranteed minimum provision times
- Target and guaranteed minimum cessation time
- Target and guaranteed minimum repair times
- Target and guaranteed service availability

Working in object oriented software technology, models
for services and SLAs have been developed by the
International Standards bodies (OSI/NMF and CCITT). These
provide Generic Managed Object classes that define services
and SLAs. The concept of a feature Managed Object is
introduced to define a component of a service that can be
offered to the customer. The logical numbering scheme
permitted in Intelligent Networks is an example of such a
feature. Features can be "nested" so that one feature is a
component of another feature. The mapping from the feature
to the underlying network resources is also defined in the
feature object. In an intelligent network of known type,
having a structure including a service control point (SCP) (or
other means) for making reference to service and/or customer
data, the service control point (SCP) would typically be a
resource on which many features (e.g. logical numbering,
time-of-day routing) depend.
Information about billing, fault handling and
performance criteria may also be held within a feature, so
long as it is common to all instances of that feature. It is
possible for instance though that the performance criteria of
some features will depend on the use to which they are put.

A SLA is then defined in terms of the component
features that support the service in question. In addition
to this, information about the contract and a description of
the service covered by the SLA is also kept. A SLA will
typically refer to a number of features, which in turn may
refer to other features and resources. To support this
relationship a number of dependency relationship types can be
defined (supports, depends-on etc).
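
The nesting of features and their dependence on resources can be pictured with a small sketch. It is illustrative only: the class names `Feature` and `SLA`, the helper `affected_slas`, the customer names and the resource label "link-A" are assumptions made for this example and are not taken from the managed object models referred to above.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A component of a service; may use resources and nest other features."""
    name: str
    resources: set[str] = field(default_factory=set)            # e.g. {"SCP"}
    depends_on: list["Feature"] = field(default_factory=list)   # nested features

    def uses(self, resource: str) -> bool:
        """True if this feature, or any feature nested in it, depends on `resource`."""
        return resource in self.resources or any(
            f.uses(resource) for f in self.depends_on
        )


@dataclass
class SLA:
    """An SLA defined in terms of the features that support the service."""
    customer: str
    features: list[Feature]


def affected_slas(slas: list[SLA], resource: str) -> list[str]:
    """Customers whose SLA refers, directly or through nesting, to `resource`."""
    return [s.customer for s in slas if any(f.uses(resource) for f in s.features)]


# Hypothetical data: two features depend on the SCP, one nests another.
numbering = Feature("logical numbering", resources={"SCP"})
time_of_day = Feature("time-of-day routing", resources={"SCP"})
dial_plan = Feature("customer dialling plan", depends_on=[numbering])
leased = Feature("leased circuit", resources={"link-A"})

slas = [
    SLA("Acme", [dial_plan]),
    SLA("Bravo", [leased]),
    SLA("Coyote", [time_of_day, leased]),
]
print(affected_slas(slas, "SCP"))
```

Following the depends-on relationships in this way shows which SLAs are touched when a shared resource such as the SCP becomes unavailable.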

Multi Service Network Management

The customer is also likely to require the ability to
manage their own virtual network: services can be requested,
altered and ceased by the customer from on-line connections
to the service provider's equipment.

All this complexity makes network management an
extremely difficult matter, particularly where
reconfiguration is required, and particularly in the light of
SLAs.

Providing Multi-Service capabilities across more than
one country is likely to require considerable capital outlay.
To make such a network viable the operating cost has to be
kept within tight constraints. To meet this operating cost
constraint, extensive automation of management functions in
the network will be very attractive, if not essential.

According to embodiments of the present invention,
this automation will be achieved at least in part through the
use of Cooperating Intelligent Software Agent technology.
The basis for such technology is described in general terms
in various publications including:

i) "Distributed Artificial Intelligence" by M Huhns,
Volumes I and II, published by Pitman, Morgan, Kaufmann in
1987;

ii) "Fundamentals of Distributed Artificial Intelligence"
by D G Griffiths and B K Purohit, published in British
Telecommunications Technology Journal, Volume 9 No. 3, in
July 1991; and

iii) "The Role of Intelligent Software Agents in Integrated
Communications Management" by D G Griffiths and C Whitney, in
the same issue of the British Telecommunications Technology
Journal.

The relevant content of each of the above is herein
incorporated by reference.

Particular aspects of network management which might
be automated by means of embodiments of the present
invention, together or separately, include the establishment
and restoration of routes in an underlying physical network
while maintaining customer requirements satisfaction.

Long Term Service Provisioning

Service provisioning is a requirement of any
telecommunications operator. Service provisioning for a GMSN
tends to differ from conventional networks because of the
following characteristics:

- A large range of services
- A wide range of customer types
- Complex SLAs with financial penalties
- Network(s) spanning more than one country

It is likely to be a requirement that when a customer
requests that a new service be provided, they should receive
a quote and an indication of timescales within a fixed time.
The customer puts in a request for a new service (possibly
via a management terminal for existing customers, or through
a negotiator for new customers) and will expect to be told
how much the service is going to cost and when it can be made
available. If the service cannot be supported by the
existing network configuration then some reconfiguration is
clearly required and may well involve the provisioning of new
equipment.

Real-time Network Reconfiguration

When a network element fails, a number of services
could be affected. They could fail completely, or they could
fail only partially yet their quality of service may still drop
below that defined in the customer SLA. When such faults occur,
alternative ways (through network reconfiguration) must be
found for re-establishing the same service.

In a conventional network (e.g. as provided to date in
the UK PSTN) such reconfiguration is controlled by routing
tables in the switch (e.g. System-X exchanges). The switch
automatically attempts to re-route around problems in the
network through control actions from a central operations
unit. This routing takes no direct account of the type of
traffic that is being routed and, as a result, all traffic is
treated equally.

In a more complex network (such as GMSNs), where there
is a wide range of services and a large number of different
customer types, this simple approach is not so viable. It is
no longer safe to assume that all network usage is of equal
importance.

According to the present invention, there is provided
a communications network management system comprising a
distributed control system based on cooperating intelligent
software agents, wherein reconfiguration of either the
communications network or of the agents can be carried out
under the control of the agents.

Such reconfiguration would be triggered, for instance,
by a request from a customer for a new service, or in the
event of agent failure.

In the case of agent failure, in particular, it may be
very important that the reconfiguration be carried out very
fast so as to maintain or reestablish services. It will also
be important that the control systems refer to SLAs to see
which services have priority in the face of pending or actual
failure. Thus when a network fault occurs all (or all
significant) affected services need to be detected and the
consequence these have on agreed SLAs investigated. The
broken SLAs will be ranked in order of urgency and the
network reconfigured to restore service in such a way that
minimises the consequences of the failure.

An embodiment of the present invention can be
described as an open heterogeneous system architecture based
on autonomous software agents working cooperatively to solve
a sub-set of service management problems in a GMSN. The
service management problems concerned might include the above
mentioned real-time reconfiguration together with service
provision in response to customer request.

Embodiments of the present invention will now be
described in more detail, by way of example only, with
reference to the accompanying Figures, in which:

Figure 1 shows a top level architecture for a GMSN
together with a control network therefor;

Figure 2 shows the architecture of a software agent,
specifically a service management agent 5, for use in the
control network of Figure 1;

Figure 3 shows the architecture of a software agent,
specifically a customer agent 6, for use in the control
network of Figure 1;

Figure 4 shows a flow diagram for a negotiation
process in service provision in a GMSN 1 as shown in Figure
1;

Figure 5 shows a flow diagram for a bidding process in
the event of agent failure in the control network of Figure
1; and

Figure 6 shows the flow diagram of Figure 5 with some
additional steps.
Referring to Figure 1, a GMSN 1 generally comprises
communication links 2 between network nodes or switches 3.
Communications occur along the communication links 2 in a
combination determined by the configuration at the nodes 3.

The GMSN has an associated control network 4
comprising a plurality of computer systems, or software
agents, 5, 6. The software agents 5, 6 are of two types, these
being Service Management Agents (SMAs) 5, and Customer Agents
(CAs) 6. Each CA 6 is associated with a SMA 5 and acts to
negotiate between a GMSN customer and a SMA 5 that might
provide a service to that customer.

Software agents 5, 6 can enter or leave the community
which forms the control network 4. The main functions
performed by the agents 5, 6 are:

• establishment and restoration of communications
links 2 in the underlying GMSN 1

• customer requirements satisfaction

• re-establishment of GMSN control in case of agent
failure.

The establishment and restoration of links 2 is
carried out by the SMAs 5 whereas customer requirements
satisfaction is based on a process of dialogue and
negotiation between a CA 6 and a SMA 5 acting as a service
provider.

Notably, "planning" for service provision and customer
service negotiation is performed in a context of incomplete
knowledge and constraining requirements. Embodiments of the
present invention provide processes for the solution of these
problems, notable features of which processes are that they
are distributed and resilient to failure. The distributed
aspect supports improved performance over a centralised
system as there is scope for reducing the total amount of
data passed to a central point and the inherent resilience of
the distributed system permits graceful degradation.

Conveniently, there may be one software agent, a SMA
5, situated at each of the GMSN nodes 3, each SMA 5
monitoring its underlying switch 3 as well as the links 2
connected to the switch 3. Primarily, each SMA 5 controls
just one switch 3 but any given SMA 5 has the ability to
control a number of switches 3 simultaneously. That is, a
SMA 5 is able to specify which incoming and outgoing
communication links 2 a service will use.

The software agents 5, 6 form a single layered system.
The SMAs' responsibility is to provision customers' services
by means of the current network resources and to maintain the
services already installed. That is, when a communication
link 2 fails, all the services using that link 2 will be
affected and will need to have a new route, or combination of
links 2, allocated to them. The control network 4 of
software agents 5, 6 performs these functions through
cooperation since each has only local knowledge but must
perform in a global context.

Agents' Architecture

(a) Service Management Agent (SMA) 5

Referring to Figure 2, in order to play its role
within the control network 4, each SMA 5 has to have well
structured knowledge and the capability to use that knowledge
in cooperating with other agents 5, 6. Acting in a
dynamically changing environment, a SMA 5 may evolve through
various states 30. A state 30 is defined as an instance of
the agent's knowledge, created as a result of the agent's
interaction with the physical environment and/or contact with
other agents. The SMA's knowledge may be partitioned into
two categories, the agent's database 31 and the agent's
working memory 32. The agent's database 31 carries
descriptions of neighbouring agents' topology 33, local
network topology 34 that the relevant agent 5 is responsible
for and a traffic profile 35. This latter describes services
already installed which use the agent's local network. The
agent's working memory 32 consists mainly of queues of
messages received 36 and sent 37 by the agent, which queues
36, 37 arise during the solving of network problems.

Additionally, each SMA 5 has a set of "message
handlers" 38 that enable the agent's methods and algorithms
(Generic Agent Code) to be triggered and used appropriately
for each type of message. The Generic Agent Code includes:

- the agent's knowledge evaluation and updating
  algorithm
- a distributed routing algorithm
- a customer service negotiation algorithm
- a "bidding" mechanism for use in reallocating
  control in the case of agent failure.

The agent's database 31 is constantly updated during
an agent's existence and is enhanced through contact with
neighbouring SMAs 5 during problem solving sessions. Based
on the messages it receives, such as alarms, partial route
results, confirmation and reservation of circuits along a
route in order to install a service, etc, each SMA 5 builds
its own model 39 of the GMSN 1 and the services running on
it.

(b) Customer Agent (CA) 6

Referring to Figure 3, in order to satisfy a
customer's requirements for a service, a second type of
software agent, the CA 6, is provided. Each CA 6 is coupled
with a SMA 5 and comprises, as a minimum subset, the
following:

- a friendly user-interface 60
- a data base 61 containing information about the
  range of services offered on the GMSN 1, tariffs
  and priorities
- a strategy for negotiation 62
- a CA-SMA communication protocol 63
The user interface 60 permits dialogue with a customer
so as to achieve customer requirements capture, provision of
advice to the customer, for instance, on services, tariffs
etc, customer/service provider mediation and accommodation of
customer decisions such as change/modify requirements and
solution acceptance.

The services database 61 contains information about
the range of services that might be offered by a service
provider on the GMSN 1 and other information reflecting that
provider's tariffs policy. It is updatable.

The strategy for negotiation 62 may be implemented in
either of at least two ways. Firstly, this might be by
mediation between the customer and the service provider, the
customer taking all decisions. Alternatively, the customer
might provide the service requirements and cost range he/she
is able to accept, giving the CA 6 the freedom to negotiate
for the best available service to satisfy those requirements
and cost range.

The embodiment described below is an implementation
which follows the first approach, the customer taking all the
decisions and the CA 6 mediating between the customer and the
service provider. The CA 6 acts in the interest of the
customer who requires a service, and the customer may simply
request the highest possible quality and priority, for
minimal cost. The mediation requirement arises because the
service provider, represented by a SMA 5, wants to establish
the service using the minimum of network resources at minimum
operating cost. A dialogue therefore arises between the CA
6 and a relevant SMA 5 to reach a mutually acceptable
agreement. This is carried out by the process of agent
negotiation described in the following section.

Customer Service Provision through Agent Negotiation

Before describing service provision in response to
customer request, it is important to see how a service is
modelled in the present embodiment of the invention, and to
know the main assumptions made about the services. The
service definition is as follows:

Service = (ServiceName, Cost, Priority, Bandwidth,
Source, Destination)

Optionally, the service definition might also include 
"Quality of Service". 

Notably the service parameters Cost, Priority and
Bandwidth are reconsidered and may be altered during the
negotiation between a SMA 5 and a CA 6, prior to service
acceptance and installation. This is further described later.
The service assumptions are as follows:

• a service is an end-to-end connection with a
single path. No broadcast services are considered

• services are considered to be bi-directional,
that is, traffic flows in both directions along the
provisioned path

• a service bandwidth is expressed in terms of the
number of circuits required

• a service bandwidth is constant, that is, not
varying along its path or with time of day

• services are prioritised on the basis of a
priority number that is determined beforehand (through
negotiation) and never changes whilst the service is in
operation

• the priority of a service is directly
proportional to its selling price

• a lower priority service may be temporarily
disturbed if another service with a higher priority requires
some of the resources taken up by the lower priority service.
This is necessary to form a cost effective route for the new
service.
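
The service definition and assumptions above can be sketched as a small record type. This is a minimal illustration, not the embodiment's data model: the field names, the use of an integer priority, the node labels "N1" and "N3" and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Service:
    """Service = (ServiceName, Cost, Priority, Bandwidth, Source, Destination)."""
    service_name: str
    cost: float
    priority: int          # fixed once the service is in operation
    bandwidth: int         # expressed as a number of circuits, constant along the path
    source: str
    destination: str
    quality_of_service: Optional[str] = None   # optional extra entry


# Hypothetical request: values are illustrative only.
requested = Service("video-link", cost=100.0, priority=3, bandwidth=25,
                    source="N1", destination="N3")

# Cost, Priority and Bandwidth may be altered during negotiation, before
# acceptance; a frozen record forces each revision to be a new value.
revised = replace(requested, bandwidth=20)
print(requested.bandwidth, revised.bandwidth)
```

Making the record immutable mirrors the assumption that a priority never changes whilst the service is in operation, while `replace` models the pre-acceptance revisions.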

On receiving a customer request for a new service, the
CA 6 matches it against the range of available services
offered by the service provider and builds a service
specification which is handed over to the SMA 5 responsible
for the source node for that particular service. The
specification of the requested service takes the following
form:

Customer-Request = (ServiceName, Cost*, Priority*,
Bandwidth*, Source, Destination)

where * denotes initial value for those parameters.

Again, optionally, the specification may include
"Quality of Service".

When the SMA 5 responsible for the relevant source
node receives the Customer-Request to provide a service, it
will need to cooperate with the other SMAs 5 to find the most
cost effective route from source to destination. The SMA 5
which will be the service provider to the customer in this
context takes the Customer-Request and either initiates the
process of generating routes or puts an entry in a pending
queue of entries corresponding to each Customer-Request and
triggers a "watchdog" time-out to limit the total time
waiting for a response.
When the process of generating a route is initiated,
route generation is done by using a distributed routing
algorithm, examples of which are known and hence not
described in detail herein. Networks such as the one under
consideration must be regarded as dynamic. That is, nodes
and links may be added to or deleted from the system and
capacity on any link may vary. The inclusion and handling
of these constraints require algorithms that are highly
adaptive to changes. It is to meet these requirements that
a distributed routing algorithm to be performed by agents is
found attractive.

A distributed routing algorithm can for instance
involve exploring all paths but at the same time each SMA 5
involved in developing a set of route(s) holds the cost of
the least costly route so far developed and handed down to it
via a forward message by another SMA 5. The SMAs would then
compare the cost of partial routes being developed with that
of the least costly route held. If a partial route is more
expensive it is abandoned as it certainly does not lead to a
cost effective route. Otherwise, it proceeds to reach
completion (towards reaching the destination) at which stage
a backward message is directed along the route to the SMA 5
that initiated the search.
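
The pruning rule just described can be illustrated with a single-process sketch. It is not the distributed algorithm itself: recursion stands in for forward messages between SMAs, appending to a list stands in for the backward message, and the function name `find_routes`, the node labels and the link costs are assumptions made for this example.

```python
def find_routes(links, source, destination):
    """Explore paths from source to destination, abandoning any partial
    route that is already dearer than the least costly complete route.

    links: dict mapping node -> {neighbour: link cost}.
    Returns (best_cost, completed), where completed lists every route that
    reached the destination as a (cost, path) pair.
    """
    best = {"cost": float("inf")}   # least costly complete route held so far
    completed = []                   # stands in for backward messages

    def forward(node, path, cost):
        if cost > best["cost"]:      # more expensive partial route: abandon
            return
        if node == destination:      # route complete: report to the initiator
            best["cost"] = cost
            completed.append((cost, path))
            return
        for neighbour, link_cost in links.get(node, {}).items():
            if neighbour not in path:            # never revisit a node
                forward(neighbour, path + [neighbour], cost + link_cost)

    forward(source, [source], 0)
    return best["cost"], completed


# Hypothetical bi-directional network with illustrative link costs.
links = {
    "N1": {"N2": 4, "N3": 1},
    "N2": {"N1": 4, "N4": 1, "N3": 1},
    "N3": {"N1": 1, "N2": 1, "N4": 5},
    "N4": {"N2": 1, "N3": 5},
}
best_cost, completed = find_routes(links, "N1", "N4")
print(best_cost, min(completed)[1])
```

In a real agent network each SMA would hold only its own links and the best cost handed down to it, so the bound would tighten as messages propagate rather than through a shared variable.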

The network of SMAs 5 thus goes into action to find a
set of routes to satisfy the request entries and to return
the prospective routes to the SMA 5 which has become the
service provider. Any of these routes may have the
following structure:

Route-Result = (Free-Cap, Cost, [N_1, ..., N_j], [(S_1, P_1), ..., (S_k, P_k)])

where "Free-Cap" is the global free capacity along the route
and "Cost" is simply the cost for that route.



Referring to Figure 1, each network node 3 might be
separately numbered N_1, N_2, etc. Hence a route through the
GMSN 1 can be expressed by listing the relevant nodes 3
through which the route will pass. An example may thus be a
route [N_1, N_5, N_8, N_3]. Looking at capacities available on the
route links, that is free capacities, these might be as
follows:

Link-Cap_{1,5} = 30, Link-Cap_{5,8} = 50, Link-Cap_{8,3} = 20

Free-Cap = min(Link-Cap_{1,5}, Link-Cap_{5,8}, Link-Cap_{8,3}) = 20.



[N_1, ..., N_j] is the route given as a list of nodes 3 from
the source (N_1) to the destination (N_j).

[(S_1, P_1), ..., (S_k, P_k)] is a disruption list, that is,
a list of all the services (S_t) with their priorities (P_t)
that might be disrupted if the proposed new service were
installed along that route.
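
The Free-Cap entry of a Route-Result is simply the minimum of the free capacities of the links on the route, which can be sketched as follows. The function name `free_cap`, the node labels "A" to "D" and the keying of links by unordered node pairs are assumptions made for the example; the capacities 30, 50 and 20 are those of the worked example above.

```python
def free_cap(route, link_cap):
    """Free capacity of a route: the smallest free capacity among its links.

    route: list of node names from source to destination.
    link_cap: dict mapping frozenset({a, b}) -> free circuits on link a-b
              (unordered keys, since services are bi-directional).
    """
    hops = zip(route, route[1:])                 # consecutive node pairs
    return min(link_cap[frozenset(hop)] for hop in hops)


# Hypothetical four-node route with the capacities from the example above.
route = ["A", "B", "C", "D"]
link_cap = {
    frozenset({"A", "B"}): 30,
    frozenset({"B", "C"}): 50,
    frozenset({"C", "D"}): 20,
}
print(free_cap(route, link_cap))   # 20
```

The bottleneck link determines how many circuits a new service can take without disturbing anything already installed on the route.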
The prospective routes are subsequently listed in
descending order with respect to Free-Cap. It should be
noted that each route in the list necessarily satisfies the
Source and Destination entries in the associated Customer-
Request. It also necessarily satisfies the Cost entry
(Cost ≤ Cost*).

It may be that the first route in the list (the one
with the maximum Free-Cap) satisfies:

Free-Cap_1 ≥ Bandwidth*

In this case no other services need to be disrupted
(the disruption list should be empty) and the load of the
network with services is kept under control since the route
with the maximum available capacity is to be chosen. Then the
SMA 5 acting as service provider (SP) gets the particular
route Route_1 and sends a message to the CA 6 informing it
about this route in order to obtain the customer agreement to
install the service on that particular route.

If on the other hand

Free-Cap_1 < Bandwidth*

then a process of negotiation starts between the CA 6 and SP.
If none of the routes has enough Free-Cap to satisfy
the bandwidth required, the SP representing the company
interest uses a decision function to choose the optimal route
on which services may be disrupted. This decision function
is described below.



For Route_i, i = 1, ..., n, SP computes:

M_i = \frac{1}{k} \left( \sum_{t=1}^{k} P_t - \mathrm{Priority} \right)    (Eq. 1)



where, as pointed out above, (P_1, ..., P_k) are the priorities of
the services (S_1, ..., S_k) that must be disrupted if Route_i is to be
established. M_i is the average net priority loss per service
if services (S_1, ..., S_k) are to be disrupted.
It is rational to pick the route that minimises M_i.
SP, therefore, initiates a loop that linearly searches
through this list to give the route with minimum M_i such that
Priority > P_t for any value of t from 1 to k,
i.e. ∀ t = 1, ..., k (Relation **)

If such a route exists then services may be temporarily
disrupted and the CA 6 is informed about the route chosen,
waiting for the customer agreement to install the service.
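
The linear search can be sketched as below. The sketch rests on an assumption: Eq. 1 is read here as the sum of the disrupted services' priorities minus the new service's Priority, divided by the number k of disrupted services, which matches the description of M_i as an average net priority loss per service but is not confirmed by the text. The function names and sample priorities are likewise illustrative.

```python
def average_priority_loss(priority, disrupted):
    """M_i under the assumed reading of Eq. 1:
    (sum of disrupted priorities - Priority) / k."""
    return (sum(disrupted) - priority) / len(disrupted)


def choose_route(priority, routes):
    """Linear search for the route with minimum M_i satisfying Relation **,
    i.e. priority > P_t for every disrupted service on that route.

    routes: list of (route name, [P_1, ..., P_k]) pairs.
    Returns the chosen route name, or None if no route qualifies.
    """
    best = None
    for name, disrupted in routes:
        if not disrupted:
            continue                  # nothing to disrupt: handled without Eq. 1
        if any(priority <= p for p in disrupted):
            continue                  # Relation ** fails for this route
        m = average_priority_loss(priority, disrupted)
        if best is None or m < best[0]:
            best = (m, name)
    return best[1] if best else None


# Hypothetical candidate routes and disruption-list priorities.
routes = [("R1", [2, 6]), ("R2", [3, 4]), ("R3", [1, 1, 1])]
print(choose_route(5, routes))   # R1 is excluded because 6 >= 5
print(choose_route(1, routes))   # no route satisfies Relation **
```

When the search returns nothing, the negotiation over bandwidth or priority described in the following paragraphs takes over.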

At this point it is important to add that in this
implementation the SMAs 5 responsible for the links of the
route that carry the services to be disrupted identify those
services automatically and try to find alternative routes
(route restoration) for them, if possible. If not, the SMA
may renegotiate with the CA responsible for the disrupted
services. This is to minimise the loss of revenue caused by
the disruption of the lower priority services.

Otherwise, SP presents to CA its best option (the
route having the minimum M_i) and at this point the customer
may agree to lower his bandwidth requirement and accept the
free capacity available on the proposed route. If the
customer accepts the above deal his service is installed
along the route with no disruption and therefore at no extra
cost.

Otherwise if the customer wants to keep his bandwidth
requirements in force, SP negotiates with CA on the basis of
increasing the required service priority (Priority*).
Priority may for instance be directly proportional to cost.
For a higher priority service CA is expected to pay more.

If CA accepts a new higher priority, the SP computes
the extra cost that the customer needs to pay based on the
average priority loss (M_i). The total cost of the service,
which is:

Total Cost = Cost* + ExtraCost

reflects the increase of priority level:

Priority = Priority* + ExtraPriority

The ExtraPriority is the amount to be added to
Priority* in order to satisfy "Relation **" given above.
Then the same mechanism for route restoration, described
above, is applied for the disturbed services.

A short summary of the negotiation process is given
below, with reference to Figure 4:

START
step 20: CA requests that a service be provided.
step 21, 22: SP chooses the optimal route available in the
         network and determines the feasibility and cost of
         the service, and the services to be disrupted (if
         any).
step 23: SP determines whether existing services will be
         disrupted. If not, the system goes to step 24. If
         they will be disrupted, the system goes to step 25.
step 24: SP informs the CA about the proposed service (cost,
         route) and stops.
step 25: SP checks if the service requested has a higher
         priority than the ones to be disrupted. If it does,
         the system goes back to step 24. If it does not have
         a higher priority, the system goes to step 26.
step 26: SP negotiates with CA
         - to lower bandwidth requirements
         OR
         - to increase the service priority (in this model
           priority is directly proportional to cost).
step 27: a check is made as to whether the CA finds this to
         be reasonable. If it does, the system goes to step
         24. If not, the system goes to step 28.
step 28: SP negotiates with CA to alter the service, then
         alters the technical service description and goes
         back to step 20.
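
The step sequence above can be traced with a minimal control-flow sketch. Everything concrete in it is an assumption made for illustration: the callables `plan_route`, `counter_offer`, `ca_accepts` and `alter_request`, the dictionary layout of requests and offers, the toy route with 20 free circuits, and the `max_rounds` guard, which the flow diagram does not mention.

```python
def negotiate(request, plan_route, counter_offer, ca_accepts, alter_request,
              max_rounds=5):
    """Follow steps 20 to 28; return (request, offer) on agreement, else None."""
    for _ in range(max_rounds):                       # step 20: CA requests a service
        offer = plan_route(request)                   # steps 21, 22: route, cost, disruptions
        if not offer["disrupted"]:                    # step 23: nothing disrupted
            return request, offer                     # step 24: inform CA and stop
        if all(request["priority"] > p for p in offer["disrupted"]):   # step 25
            return request, offer                     # step 24
        revised = counter_offer(request, offer)       # step 26: less bandwidth or more priority
        if ca_accepts(revised):                       # step 27: CA finds it reasonable
            return revised, plan_route(revised)       # step 24 with the revised terms
        request = alter_request(request)              # step 28: alter service, back to 20
    return None                                       # guard added for the sketch only


def plan_route(req):
    """Toy SP: one route with 20 free circuits and one priority-4 service on it."""
    overloaded = req["bandwidth"] > 20
    return {"route": ["N1", "N2"], "cost": req["cost"],
            "disrupted": [4] if overloaded else []}


def counter_offer(req, offer):
    """Toy step 26: propose lowering bandwidth to the free capacity."""
    return {**req, "bandwidth": 20}


request = {"priority": 3, "bandwidth": 25, "cost": 100}
final, offer = negotiate(request, plan_route, counter_offer,
                         ca_accepts=lambda r: True, alter_request=lambda r: r)
print(final["bandwidth"], offer["disrupted"])
```

Passing the agent behaviours in as callables keeps the loop itself identical whichever negotiation strategy the CA follows.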
This is a simple example of a SMA-CA interaction
process. However, there will be situations that would demand
more complexity.

In the foregoing, agents have been considered to be
robust and failproof. This is not a realistic assumption,
since it is entirely possible that agents could fail. The
next section discusses how the remaining agents deal with
agent failure.



AGENT FAILURE

The system model described above consists of
essentially two networks that interact: the underlying
physical communications network (referred to as GMSN network
1) and a network 4 of agents 5, 6 whose function is to manage
and control the GMSN 1. To perform these functions the
agents have certain responsibilities which in their most
general form are of two kinds: Managerial and Contractual.
As a Manager 5 the agent has the responsibility of suitably
controlling certain nodes 3 and links 2. As a Contractor 6,
the agent must ensure provision and maintenance of the
services that have been agreed upon.

The stability of the agent network 4 would initially
be disrupted when a SMA 5 fails. In such circumstances the
normal operation of the agent network 4 breaks down, since,
in view of the agent's failure, its responsibilities are
unattended, thus giving rise to an "abnormal" agent network
behaviour. The abnormality lasts unless and until either the
failed agent is revived or, if this option cannot be realised
promptly, its responsibilities are suitably allocated to its
neighbours. Since SMAs' responsibility schedules are
modified following the failure of any agents, the system gets
renormalised at a new stability threshold.

Concerning the stability threshold, each agent is
designed to work during its active life at a certain load
level (number of queries to be solved) and it is able to
manage theoretically any number of nodes 3. In reality there
are limits beyond which the agent's control system might not
be able to satisfy the performance criteria it was designed
for. The stability threshold is the average load
(contractual and managerial load) limit beyond which the
system is not able to respond in a stable manner to the
queries addressed to it.

Initially we assume there exists an isomorphism
between the agent network 4 and the GMSN 1. Therefore, each
SMA 5 manages its corresponding node 3 and possibly some of
the links 2 incident upon the node 3. Let us assume the
failure of a SMA 5 (call it A). The neighbouring SMAs will
become aware of A's failure (through detection of alarms) and
thus take over A's responsibilities in some fashion. To
achieve this, a burst of communications takes place between
SMAs 5 who know about A's failure in order to negotiate on
the allocation of A's management and contractual
responsibilities.

The basis of negotiation among agents 5, 5 is a bid
function (F) whose value is computed based on the current
state of the network 4. In order to compute the bidding
function F four criteria have been considered. Based on
those criteria the bidding function F is a weighted sum of
some pre-computed parameters (one for each criterion):

F = w1 C + w2 R + w3 O + w4 M

where C, R, O and M are parameters computed for each
criterion, as explained below, and w1 to w4 are weights, of
which w3 and w4 are negative.

Notations: NA = neighbouring agent; FA = failed agent

CRITERION I: CONNECTIVITY PARAMETER (C)
Assumptions:

[The more links a neighbouring agent NAj has connected
to the FA's nodes, the greater is its connectivity C]

[The greater the connectivity C, the greater the chance
of NAj to win the bid]

CRITERION II: SERVICE RESPONSIBILITY PARAMETER (R)
Assumptions:

[The more services a neighbouring agent NAj provisioned
using the links that were previously managed by the FA, the
greater its responsibility R to supervise and maintain those
services]

[The greater the responsibility R, the greater the
chances of NAj to win the bid]

CRITERION III: OCCUPANCY (O)
Assumptions:

[The more duties a neighbouring agent NAj has, that is,
the greater the number of queries the agent has stored in its
queue of incoming messages, the bigger its occupancy O]

[The greater the occupancy O, the lesser are the chances
for the NAj to win the bid]

CRITERION IV: MANAGEMENT (M)
Assumptions:

[The more a neighbouring agent NAj is engaged as a
manager M, that is, the greater the control it already
exercises over nodes 3 and links 2 of the underlying network
1, the less availability it has to be the new manager with
respect to the FA's nodes and links]

[The greater the management engagement M, the lesser
are the chances for the NAj to win the bid]

Each SMA aware of A's failure waits sufficiently long
to receive messages from other SMAs and the agent with the
highest bid function value takes over whatever
responsibilities it has bid for. This whole process is
triggered each time a SMA fails and proceeds until its
responsibilities (both as a manager and a contractor) have
been reallocated to the other SMAs aware of its failure.

In an example of the above bidding process, referring
again to the Bidding Function F, the weights w1, w2, w3 and w4
can be tuned and are subject to experimental results.
However, some structure can be imposed on w. Firstly, w1 and
w2 are both positive. Secondly, w3 and w4 are negative, due
to their inhibitory effect. Thirdly, the most dominant
factor must be the connectivity parameter (C). Therefore the
weight w1 has been given the value 1.

The other weights have been given the following
values:

w2 = 1/P

where P = the average number of links connected to a
node in the network (fan-out).

w3 = -1/n

where n = the total number of nodes in the network.

w4 can be tuned according to the ratio of agents to
nodes. A normal range of values would be 0 to -1. The
higher the ratio of agents to nodes, the closer w4 will
approach to -1, this having the overall effect of spreading
management responsibility amongst more agents by increasing
the effect on F of M. An expression for w4 might for
instance be "1/cluster size", where the cluster size is the
average number of nodes managed by one agent. This makes the
bid function less sensitive to the real number of nodes
managed by one agent (M) when the average cluster size is
anyway relatively high.
Thus the bid function may be given as:

F = C + (1/P) R - (1/n) O + w4 M
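
By way of illustration, the weights can be derived from three network statistics. The sketch below is an assumption-laden reading rather than a definitive implementation: the function names bid_weights and bid_value are hypothetical, w3 is taken as the negative reciprocal of the node count, and w4 is taken as the negative reciprocal of the average cluster size, which combines the "1/cluster size" expression with the stated range of 0 to -1.

```python
def bid_weights(fan_out, total_nodes, num_agents):
    """Return (w1, w2, w3, w4) for the bid function F."""
    cluster_size = total_nodes / num_agents  # average number of nodes per agent
    w1 = 1.0                                 # connectivity is the dominant factor
    w2 = 1.0 / fan_out                       # 1/P
    w3 = -1.0 / total_nodes                  # assumed form: -1/n
    w4 = -1.0 / cluster_size                 # assumed sign: negative, within 0 to -1
    return w1, w2, w3, w4


def bid_value(weights, c, r, o, m):
    """F = w1*C + w2*R + w3*O + w4*M."""
    w1, w2, w3, w4 = weights
    return w1 * c + w2 * r + w3 * o + w4 * m


weights = bid_weights(fan_out=4, total_nodes=10, num_agents=4)
print(weights)  # (1.0, 0.25, -0.1, -0.4)
```

With a larger average cluster size, w4 moves towards 0, so the management parameter M counts for less in F.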

SCENARIO

We consider a 10-node network with an average fan-out
of 4. A, B, C and D are four agents in the control layer
and each of them is responsible for a number of nodes 3 as
given below. We take the case of the failure of agent A,
where B, C and D are its neighbouring agents:
Failed agent: A
Neighbours: B, C and D

Bidding Formula weights: w1 = 1, w2 = 1/4, w3 = -1/10
and w4 = -0.4

Agents' Details

Agent Name: B
Connectivity (B's nodes to A's nodes) = 5 links
Provisioned Services (with A's cooperation) = 8 services
Current Queries to be solved = 4
Managed Nodes = 3

Agent Name: C
Connectivity (C's nodes to A's nodes) = 2 links
Provisioned Services (with A's cooperation) = 10 services
Current Queries to be solved = 5
Managed Nodes = 1

Agent Name: D
Connectivity (D's nodes to A's nodes) = 5 links
Provisioned Services (with A's cooperation) = 3 services
Current Queries to be solved = 3
Managed Nodes = 5



The resultant bidding values are:

for agent B F = 5 + 0.25*8 - 0.1*4 - 0.4*3 = 5.4

for agent C F = 2 + 0.25*10 - 0.1*5 - 0.4*1 = 3.6

for agent D F = 5 + 0.25*3 - 0.1*3 - 0.4*5 = 3.45

The conclusion is that agent B is the winner, so it will take
over the responsibilities of the failed agent A.
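
The arithmetic of the scenario can be checked with a short script. This is a sketch using only the figures given above; the dictionary layout and the name bid are illustrative, and the managed-node count of 5 for agent D is read from its bid calculation.

```python
WEIGHTS = (1.0, 0.25, -0.1, -0.4)  # w1, w2, w3, w4 from the scenario

# (C links, R services, O queries, M managed nodes) for each neighbouring agent
NEIGHBOURS = {
    "B": (5, 8, 4, 3),
    "C": (2, 10, 5, 1),
    "D": (5, 3, 3, 5),
}


def bid(params, weights=WEIGHTS):
    """Weighted sum F = w1*C + w2*R + w3*O + w4*M."""
    return sum(w * p for w, p in zip(weights, params))


bids = {name: round(bid(p), 2) for name, p in NEIGHBOURS.items()}
winner = max(bids, key=bids.get)  # the highest bid value wins
print(bids, winner)  # {'B': 5.4, 'C': 3.6, 'D': 3.45} B
```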

Referring to Figure 5, the bidding process
described above can be set out in the form of a flow chart.
It is triggered when one agent becomes aware of a neighbour's
failure (step 40) through detection of alarms. The alarm
mechanism may be seen as a simple and continuous check, in
which periodically each agent broadcasts a message to its
neighbours and then compares the list of agents replying to
this message against the list of the neighbours. An agent
missing is considered "dead" if and only if a link failure
alarm (cut connection) for the communication link with that
agent has not been received. The message forwarded to the
neighbours may be used to update their knowledge (i.e.
sending them the list of current neighbours will help them in
the bidding process).
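
The liveness check described above reduces to a set comparison. The following sketch is one possible reading, with hypothetical names: neighbours is the agent's neighbour list, replies holds the agents that answered the last broadcast, and link_alarms holds the neighbours for which a link failure alarm has been received.

```python
def find_dead_agents(neighbours, replies, link_alarms):
    """Return the neighbours presumed dead after one broadcast round.

    A silent neighbour counts as dead only if no link failure alarm
    explains its silence, since a cut link does not imply a failed agent.
    """
    silent = set(neighbours) - set(replies)
    return silent - set(link_alarms)


dead = find_dead_agents(
    neighbours={"A", "C", "D"},
    replies={"D"},
    link_alarms={"C"},  # C is unreachable because its link was cut
)
print(dead)  # {'A'}
```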

The next step for each agent aware of agent A's
failure is to compute the bidding function F, step 41, and
to send out its own bid value (step 42). Taking the example
of an agent, agent B, it then goes into a cycle, steps 50,
43, 44, 51, during which it waits for announcements and bids
to be received from other neighbouring agents. At step 50,
it checks its entry messages for bids received from the other
neighbouring agents or for announcements of the winner. At
step 43, it makes a decision as to whether the winner has
been found. This could be because agent B has received an
announcement of the winner from another agent, in its entry
messages (step 50). If it has, it comes out of the cycle
but, if not, it continues to step 44, which is preparatory to
calculating the winner itself. That is, if all bids have
been received from the other neighbouring agents, agent B
will again come out of the cycle and this time compare the
received bids, together with its own bid, to find the winner,
step 46. If all bids have not yet been received, agent B
will wait for a reasonable period, step 51, then return to
the start of the cycle, step 50.

The two routes out of the cycle, at steps 43 and 44,
relate to the cases where another neighbouring agent has
received all bids prior to agent B (step 43) and where agent
B is apparently the first to receive all the bids and
therefore finds the winner on its own account (steps 44, 46).

A further test has to be made in either case, step 47,
by agent B to assess whether it itself is the winner since as
the winner it must assume the responsibilities of agent A.
Thus if agent B finds at step 47 that it is the winner, it
will update its knowledge, step 49, consequently taking over
the responsibilities of the failed agent A, send an
announcement, step 53, to all the other neighbouring agents
involved in the bidding, and terminate the process, step 54.
If agent B is not the winner, agent B updates itself this
time by assigning a pointer "agent A - Winner", step 45, to
ensure it communicates with the winner in future rather than
with agent A. Again, agent B then makes an announcement of
the winner, step 53, to all other neighbouring agents
involved in the bidding, and terminates the process, step 54.

Referring to Figure 6, in an alternative version, the
process steps followed by the agents might include additional
checks which allow them to ensure they have updated
themselves appropriately without repeating updating steps 49,
45.

In this version, if agent B knows the winner after
reading its entry messages (step 43), it goes to step 100, to
make a check whether it has already updated its records in
respect of a winner. If it has, it simply goes to STOP (step
54). If it hasn't, it reverts to step 47, and continues
substantially as in the version of Figure 5. In order to
supply the information for step 100, however, after steps 49
or 45 (updating own knowledge or assigning a pointer) it sets
a flag for itself, step 52, to show it has updated its
records in respect of a winner.

The version of Figure 6 provides for the case where
agent B receives all bids (step 44), compares and finds the
winner (step 46), then subsequently also receives an
announcement of the winner from another agent. In the
version of Figure 6, the subsequent announcement will cause
agent B simply to go to STOP (step 54) since the check at
step 100 will show its records have already been updated.

The version of Figure 6 will also deal with the case
where an agent receives an announcement of a winner from
more than one neighbouring agent. Again, the extra logic of
updating its own records can be avoided on receipt of the
second (and subsequent) announcement(s).
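
The decision logic of Figures 5 and 6 can be sketched as a small class. This is an illustrative reading only: the class BiddingAgent, its method names and the message shapes are assumptions, the waiting period of step 51 and the message transport are left out, and the check of step 100 is applied here on both routes out of the cycle for simplicity.

```python
class BiddingAgent:
    def __init__(self, name, own_bid, expected_bidders):
        self.name = name
        self.bids = {name: own_bid}        # steps 41, 42: own bid computed and sent
        self.expected = set(expected_bidders) | {name}
        self.updated = False               # flag set at step 52
        self.pointer = None                # "agent A - Winner" pointer of step 45
        self.took_over = False
        self.announcements = []            # winners announced at step 53

    def on_bid(self, sender, value):
        """Steps 50, 44, 46: store a bid and decide once all bids are in."""
        self.bids[sender] = value
        if self.expected <= set(self.bids):
            self._settle(max(self.bids, key=self.bids.get))

    def on_announcement(self, winner):
        """Steps 50, 43: another agent has already found the winner."""
        self._settle(winner)

    def _settle(self, winner):
        if self.updated:                   # step 100: already updated, go to STOP
            return
        if winner == self.name:            # step 47: is this agent the winner?
            self.took_over = True          # step 49: assume the failed agent's duties
        else:
            self.pointer = winner          # step 45: redirect traffic to the winner
        self.updated = True                # step 52
        self.announcements.append(winner)  # step 53, then STOP (step 54)


b = BiddingAgent("B", 5.4, ["C", "D"])
b.on_bid("C", 3.6)
b.on_bid("D", 3.45)
b.on_announcement("B")                     # a late announcement changes nothing
print(b.took_over, b.announcements)        # True ['B']
```

Because the flag is checked before any update, a second or later announcement leaves the agent's records untouched.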

There may of course be further alternative processes
to the above, without departing from an embodiment of the
present invention. For instance, the process steps following
START might include checks to avoid repeating a response to
alarm messages which have already been dealt with.

In the above description of a bidding process, certain
assumptions have been made. These are as follows:

1. An agent can communicate with any other agent in
the community using direct or indirect
communication.

2. For the sake of simplicity, the bidding algorithm
is designed to involve only neighbouring agents
still connected through direct links of
communication with the failed agent. Those
neighbours whose communication links have been
previously interrupted would not be able to
"read" the agent's failure alarms but they can
receive information about the final bidding
decision taken by the agents directly connected
with the failed agent.

The transfer of responsibilities away from the failed
agent might be implemented in one of two ways. The winning
agent might for instance gain access to the failed agent's
data base from where it is able to extract the information
previously owned by the failed agent. This of course is
based on the assumption of a valid/accessible data base. A
second approach is based on the idea of the "winner"
re-building the information stored in the failed agent's data
base (that is currently not available) through dialogue with
the other neighbours of the failed agent. Using this
approach it is still possible to recover information, such as
connectivity-nodes and links, installed services on failed
agent's links etc.

SERVICE RESTORATION

The above describes the response to agent failure in
the control layer 4. However, nodes 3 and links 2 of the
underlying GMSN 1 may also fail. When a node 3 fails to
operate, all the links 2 incident upon it fail to operate.
Hence node failure is equivalent to multi-link failure and
thus resolves to the more basic case of link failure. It
therefore suffices to consider the problem of link failure.
All the services "running" along the failed link should be
detected and re-routed. Re-routing an existing service can
itself be regarded as a type of service provisioning.

The restoration procedure adopted applies the same
"branch and bound" routing procedure used for service provision
and referred to above.

The failure of a network link 2 causes an alarm
message to be sent automatically to the SMA 5 responsible for
the link. The SMA 5 then identifies the affected services to
be re-routed and places them in its queue to be dealt with in
order of their priority. The SMA 5 sends a re-routing
request, which is similar to the request for a new service
(already described), to its neighbours asking them to provide
alternative routes around the failed link. The re-routed
service is the establishment of a route with the capacity of
the disrupted service from the origin (where disruption
occurs) to destination (where disruption ends). When the
results have been returned to the SMA responsible for the
failed link, the lowest cost route is chosen.

In some cases there may not be an alternative route
for the service. This may be because:

- there are no alternative routes with the required
capacity;

- the cost limit was too stringent;

- insufficient search time was allowed.

Whatever the case, the originating SMA (the agent to
which a link failure was signalled) must decide what to do.
One simple course of action could be to relax the cost
constraints and try again.
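
A minimal sketch of the route selection and of the retry with relaxed cost constraints follows. The names choose_route and restore, the (cost, capacity, path) tuples, the relaxation factor of 1.5 and the attempt limit are all assumptions made for illustration. The branch and bound search that produces the candidate routes is not shown, and a fixed candidate list stands in for the replies of the neighbouring agents.

```python
def choose_route(candidates, required_capacity, cost_limit):
    """Pick the lowest cost candidate that meets the capacity and cost limit."""
    feasible = [r for r in candidates
                if r[1] >= required_capacity and r[0] <= cost_limit]
    return min(feasible, key=lambda r: r[0]) if feasible else None


def restore(candidates, required_capacity, cost_limit,
            relax_factor=1.5, max_attempts=3):
    """Retry with a relaxed cost limit when no route is found."""
    for _ in range(max_attempts):
        route = choose_route(candidates, required_capacity, cost_limit)
        if route is not None:
            return route, cost_limit
        cost_limit *= relax_factor  # relax the cost constraint and try again
    return None, cost_limit


routes = [(12.0, 10, ["n1", "n4", "n7"]), (9.0, 4, ["n1", "n2", "n7"])]
print(restore(routes, required_capacity=8, cost_limit=10.0))
# ((12.0, 10, ['n1', 'n4', 'n7']), 15.0)
```

In this example the cheaper route lacks capacity and the other exceeds the initial cost limit, so the route is only found after one relaxation.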

It should be noted that although in Figure 1 there is
shown a 1:1 relationship between the SMAs 5 and the nodes 3
of the GMSN 1, this is not necessarily the case. Indeed it
is more likely to be found more efficient that there are
fewer SMAs 5 than there are nodes 3, each SMA 5 therefore
controlling more than one node 3.

It should also be noted that, in the example of the
present invention described above, where there is failure of
Agent A and Agent B is the winner, Agent B takes over the
responsibilities of Agent A. In practice, it may be found
more efficient that the responsibilities of Agent A are
distributed amongst more than one other agent and the bidding
process may therefore alternatively be designed such that
remaining agents bid only for selected responsibilities of a
failed agent, or that each neighbouring agent puts forward
different bid functions in respect of different
responsibilities of the failed agent.

In this specification, the term "co-operating
intelligent software agents" is used. Without limiting the
understanding of a person skilled in the relevant technology,
for the purposes of this specification a co-operating
intelligent software agent can be considered to be a software
entity capable of performing the functions set out, as far as
necessary, in an embodiment of the present invention. A
relevant software entity would probably therefore comprise a
data store, or access to a data store, at least some data (or
access to some data) which is local to the software entity
rather than global with respect to the communications
network(s), intelligence in that it can make decisions and
act on them, communications means for communicating with
other agents, control outputs for issuing control signals to
allocated nodes, and updating means for updating its data.

1. A communications network management system, for
managing a network which comprises a plurality of nodes
connected by traffic links and wherein communication services
can be provided to customers according to predetermined
service parameters by allocating selected links and nodes to
said services on a priority basis, the management system
comprising a distributed control system based on co-operating
intelligent software agents, said software agents
individually having control over the configuration of one or
more allocated nodes of said plurality of nodes, and thereby
having control with respect to communications services
provided via said allocated node or nodes,
wherein a change in communication services provided by
said network can be established in response to customer
request by reconfiguration of one or more of said plurality
of nodes by means of control output issued by the software
agent or agents associated with that node or nodes,
subsequent to a decision-making process initiated amongst
said agents and based on parameters of said customer request
modified in accordance with said priority basis.

2. A communications network management system, for
managing a network which comprises a plurality of nodes
connected by traffic links, wherein communication services
can be provided to customers according to predetermined
service parameters by allocating selected links and nodes to
said services on a priority basis, the management system
comprising a distributed control system based on co-operating
intelligent software agents, said software agents
individually having control over the configuration of one or
more allocated nodes of said plurality of nodes, and thereby
having control with respect to communication services
provided via said allocated node or nodes,
wherein, on failure of an agent, one or more
neighbouring agents are alerted to said failure and initiate
a bidding process, each of said neighbouring agents putting
forward a bid value based on parameters weighted so as to
give, in combination, an estimate of that agent's suitability
to take over some or all responsibilities of said failed
agent, the neighbouring agent putting forward a winning bid
value thereafter asserting said responsibilities.



3. A network management system according to either one of
the preceding claims, wherein there is more than one type of
software agent, there being provided service management
agents which have direct control outputs to one or more of
said nodes of the network, and customer agents, each of which
customer agents is associated with at least one service
management agent, but has no direct control output to a node
of the network.

4. A network management system according to claims 2 and
3, wherein the bid value "F" for a neighbouring agent is
calculated according to the function:

F = w1 C + w2 R + w3 O + w4 M

where C, R, O and M are parameters computed in respect of
connectivity, service responsibility, occupancy and
management load for that neighbouring agent, and w1, w2, w3
and w4 are weighting factors, w3 and w4 being negative.

5. A network management system according to claim 4,
wherein connectivity is allocated the greatest weighting
factor.



6. A method of reconfiguring a communications network in
response to a requirement for a change in communications
services available by means of said network, wherein the
network comprises a plurality of reconfigurable nodes
connected by links for carrying traffic, and wherein there is
provided a management system comprising a community of
co-operating software agents having management control over the
configuration of allocated nodes of said network, the method
comprising:

i) initiation of a negotiation process amongst at least
some of said software agents, the negotiation process
being based on constraints including relative
priorities allocated to said communication services;

ii) outputting a reconfiguration control output from one
or more of said software agents to one or more nodes
of the network in accordance with the outcome of said
negotiation process.

7. A method according to claim 6, wherein said change in
communication services comprises new service provision.

8. A method according to claim 6, wherein said change in
communication services comprises reconfiguration of said
network in order to reinstate services subsequent to failure
of one or more elements of said network.

9. A method of managing a communications network, said
network comprising a plurality of nodes connected by links
for carrying communications traffic, and wherein there is
provided a management system comprising a community of
software agents, individual ones of which control outputs to
one or more allocated nodes of the network,
wherein, on failure of a software agent, a bidding
process is initiated in said community of software agents, at
least one agent outputting a bid function F representing
weighted values of parameters relevant to that agent in
respect of taking over the responsibility of the failed
agent, and, on completion of said bidding process, the agent
which has output the most favourable bid function F assumes
one or more responsibilities of said failed agent.



10. A network management system according to any one of
claims 1, 2, 3, 4 or 5 wherein each software agent having
control over the configuration of one or more nodes of the
network has an associated database comprising data which is
incomplete with respect to the network as a whole, but
complete in respect of local data enabling the agent to
exercise said control.

11. A method according to claim 9, wherein the agent which
assumes the responsibilities of said failed agent downloads
data from said failed agent as an initial step in said
assumption of responsibilities.